109 research outputs found

    An artificial neural network for estimating haplotype frequencies

    The problem of estimating haplotype frequencies from population data has been considered by numerous investigators, resulting in a wide variety of algorithmic and statistical solutions. We propose a novel approach that employs an artificial neural network (ANN) to predict the most likely haplotype frequencies from a sample of population genotype data. Through an innovative ANN design for mapping genotype patterns to diplotypes, we have produced a prototype that demonstrates the feasibility of this approach, with provisional results that correlate well with estimates produced by the expectation maximization algorithm for haplotype frequency estimation. Given the computational demands of estimating haplotype frequencies for 20 or more single-nucleotide polymorphisms, the ANN approach is promising because its design fits well with parallel computing architectures.
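
For readers unfamiliar with the expectation maximization (EM) baseline named in this abstract, the following is a minimal sketch for the simplest case of two biallelic SNPs. It is not the authors' ANN and not their EM implementation; the function name `em_haplotype_freqs`, the 0/1/2 genotype encoding, and the fixed iteration count are assumptions made for illustration. Only the double heterozygote has ambiguous phase, so the E-step splits it between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate two-SNP haplotype frequencies with a basic EM loop.

    genotypes: list of (g1, g2) tuples, where g1 and g2 count copies (0, 1, 2)
    of allele "1" at SNP 1 and SNP 2 (a hypothetical encoding).
    Returns a dict of frequencies for haplotypes "00", "01", "10", "11".
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    freqs = {h: 0.25 for h in ("00", "01", "10", "11")}  # uniform start
    n_hap = 2 * len(genotypes)  # two haplotypes per individual
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step for the ambiguous case: phase is 00/11 or 01/10.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts["00"] += w
                counts["11"] += w
                counts["01"] += 1.0 - w
                counts["10"] += 1.0 - w
            else:
                # Phase is unambiguous: read the two haplotypes directly.
                a = (g1 // 2, (g1 + 1) // 2)  # SNP 1 alleles on each haplotype
                b = (g2 // 2, (g2 + 1) // 2)  # SNP 2 alleles on each haplotype
                counts[f"{a[0]}{b[0]}"] += 1.0
                counts[f"{a[1]}{b[1]}"] += 1.0
        # M-step: expected haplotype counts become the new frequencies.
        freqs = {h: c / n_hap for h, c in counts.items()}
    return freqs
```

The number of candidate haplotypes doubles with each added SNP, which is the computational burden the abstract refers to for 20 or more SNPs.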

    Big data driven co-occurring evidence discovery in chronic obstructive pulmonary disease patients

    © 2017, The Author(s). Background: Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung disease that affects airflow to the lungs. Discovering the co-occurrence of COPD with other diseases, symptoms, and medications is invaluable to medical staff. Building co-occurrence indexes and finding causal relationships with COPD can be difficult because disease prevalence within a population often influences results. A method that can better separate occurrence within COPD patients from population prevalence would be desirable. Large hospital systems may have tens of millions of patient records spanning decades of collection, so a scalable big data approach is desirable. The presented method, Co-Occurring Evidence Discovery (COED), provides a methodology and framework to address these issues. Methods: Natural Language Processing methods are used to examine 64,371 deidentified clinical notes and discover associations between COPD and medical terms. Apache cTAKES is leveraged to annotate and structure clinical notes. Several extensions to cTAKES have been written to parallelize the annotation of large sets of clinical notes. A co-occurrence score is presented which can penalize scores based on term prevalence, as well as a baseline method traditionally used for finding co-occurrence. These scoring systems are implemented using Apache Spark. Dictionaries of ground truth terms for diseases, medications, and symptoms have been created using clinical domain knowledge. COED and baseline methods are compared using precision, recall, and F1 score. Results: The highest scoring diseases using COED are lung and respiratory diseases. In contrast, baseline methods for co-occurrence rank diseases with high population prevalence highest. Medications and symptoms evaluated with COED share similar results.
When evaluated against ground truth dictionaries, the maximum improvements in recall for symptoms, diseases, and medications were 0.212, 0.130, and 0.174. The maximum improvements in precision for symptoms, diseases, and medications were 0.303, 0.333, and 0.180. Median increases in F1 score for symptoms, diseases, and medications were 38.1%, 23.0%, and 17.1%. A paired t-test was performed and F1 score increases were found to be statistically significant (p < 0.01). Conclusion: Penalizing terms which are highly frequent in the corpus results in better precision and recall performance. Penalizing frequently occurring terms gives a better picture of the diseases, symptoms, and medications co-occurring with COPD. Using a mathematical and computational approach rather than a purely expert-driven approach, large dictionaries of COPD-related terms can be assembled in a short amount of time.
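
The abstract does not state the COED formula, so the sketch below is only an illustration of the general idea: a raw co-occurrence count (the kind of baseline that favors prevalent terms) next to a score that divides by corpus-wide prevalence. The lift-style ratio, the function names, and the set-of-terms note representation are all assumptions, not the authors' method. The second function computes precision, recall, and F1 against a ground truth dictionary, the three metrics the abstract reports.

```python
from collections import Counter


def cooccurrence_scores(notes, target):
    """Score terms that co-occur with `target` across clinical notes.

    notes: list of sets of annotated terms, one set per note (hypothetical format).
    Returns (baseline, penalized): raw co-occurrence counts, and a lift-style
    ratio that penalizes terms common in the whole corpus.
    """
    target_notes = [terms for terms in notes if target in terms]
    if not target_notes:
        return {}, {}
    in_target = Counter(t for terms in target_notes for t in terms if t != target)
    overall = Counter(t for terms in notes for t in terms if t != target)
    baseline = dict(in_target)
    penalized = {
        # rate among target notes divided by rate among all notes
        t: (in_target[t] / len(target_notes)) / (overall[t] / len(notes))
        for t in in_target
    }
    return baseline, penalized


def precision_recall_f1(predicted, truth):
    """Compare a predicted term list with a ground truth dictionary of terms."""
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In a toy corpus where "hypertension" appears in most notes regardless of COPD status, the raw count can rank it highly while the ratio pushes it below a term such as "wheezing" that appears mainly in COPD notes.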

    A method to detect single-nucleotide polymorphisms accounting for a linkage signal using covariate-based affected relative pair linkage analysis

    We evaluate an approach to detect single-nucleotide polymorphisms (SNPs) that account for a linkage signal with covariate-based affected relative pair linkage analysis in a conditional-logistic model framework using all 200 replicates of the Genetic Analysis Workshop 17 family data set. We begin by combining the multiple known covariate values into a single variable, a propensity score. We also use each SNP as a covariate, using an additive coding based on the number of minor alleles. We evaluate the distribution of the difference between LOD scores with the propensity score covariate only and LOD scores with the propensity score covariate and a SNP covariate. The inclusion of causal SNPs in causal genes increases LOD scores more than the inclusion of noncausal SNPs either within causal genes or outside causal genes. We compare the results from this method to results from a family-based association analysis and conclude that it is possible to identify SNPs that account for the linkage signals from genes using a SNP-covariate-based affected relative pair linkage approach
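
As a small illustration of two quantities this abstract relies on, the sketch below shows additive coding of a SNP by minor allele count and the LOD score difference between the two covariate models. The genotype string format, the tie-breaking rule for the minor allele, and both function names are assumptions; the linkage model that produces the LOD scores is not reproduced here.

```python
from collections import Counter


def additive_coding(genotypes):
    """Code each genotype as its number of minor alleles (0, 1, or 2).

    genotypes: list of two-character strings such as "AG" (hypothetical format).
    The minor allele is taken to be the less frequent allele in this sample;
    ties are broken alphabetically so the result is deterministic.
    """
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) > 2:
        raise ValueError("expected a biallelic SNP")
    if len(allele_counts) < 2:
        return [0] * len(genotypes)  # monomorphic: no minor allele present
    minor = min(allele_counts, key=lambda a: (allele_counts[a], a))
    return [g.count(minor) for g in genotypes]


def lod_gain(lod_propensity_only, lod_with_snp):
    """Increase in LOD score when a SNP covariate joins the propensity score."""
    return lod_with_snp - lod_propensity_only
```

Under the abstract's finding, causal SNPs in causal genes would tend to show larger values of `lod_gain` than noncausal SNPs.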

    The effects of a 6-week strength training on critical velocity, anaerobic running distance, 30-m sprint and yo-yo intermittent running test performances in male soccer players

    The objectives of this study were to examine the effects of moderate-intensity strength training on changes in critical velocity (CV), anaerobic running distance (D'), sprint performance and Yo-Yo intermittent running test (Yo-Yo IR1) performances. Methods: two recreational soccer teams were divided into a soccer training only group (SO; n = 13) and a strength and soccer training group (ST; n = 13). Both groups were tested for values of CV, D', Yo-Yo IR1 distance and 30-m sprint time on two separate occasions (pre and post intervention). The ST group performed concurrent 6-week upper and lower body strength and soccer training, whilst the SO group performed soccer training only. Results: after the re-test of all variables, the ST group demonstrated significant improvements for both Yo-Yo IR1 distance (p = 0.002) and CV values (p<0.001), with no significant changes in the SO group. 30-m sprint performance was slightly improved in the ST group, with significantly decreased performance times identified in the SO group (p<0.001). Values for D' were slightly reduced in both groups (ST -44.5 m, 95% CI = -90.6 to 1.6; SO -42.6 m, 95% CI = -88.7 to 3.5). Conclusions: combining 6 weeks of moderate strength training with soccer training significantly improves CV and Yo-Yo IR1 performance whilst moderately improving 30-m sprint performance in male soccer players without previous resistance training. Critical velocity can be recommended to coaches as an additional valid testing tool in soccer.
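
The abstract does not say how CV and D' were derived. One common formulation is the linear distance-time model, d = D' + CV × t, fitted to several timed runs, where the slope is CV and the intercept is D'. The sketch below assumes that model; the function name and the input format are illustrative only and may differ from the authors' protocol.

```python
def fit_critical_velocity(times_s, distances_m):
    """Least-squares fit of the linear distance-time model d = D' + CV * t.

    times_s: run durations in seconds; distances_m: distances covered in metres.
    Returns (cv, d_prime): slope in m/s and intercept in m.
    """
    if len(times_s) != len(distances_m) or len(times_s) < 2:
        raise ValueError("need at least two paired (time, distance) trials")
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_d = sum(distances_m) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("trial durations must not all be equal")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times_s, distances_m))
    cv = sxy / sxx                 # slope: critical velocity
    d_prime = mean_d - cv * mean_t  # intercept: anaerobic running distance
    return cv, d_prime
```

With this model, a training effect like the one reported for the ST group would appear as a steeper fitted slope at re-test, while a change in D' would appear as a shift in the intercept.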

    Patellofemoral pain syndrome (PFPS): a systematic review of anatomy and potential risk factors

    Patellofemoral Pain Syndrome (PFPS), a common cause of anterior knee pain, is successfully treated in over 2/3 of patients through rehabilitation protocols designed to reduce pain and return function to the individual. Applying preventive medicine strategies, the majority of cases of PFPS may be avoided if a pre-diagnosis can be made by a clinician or certified athletic trainer testing the currently researched potential risk factors during a Preparticipation Screening Evaluation (PPSE). We provide a detailed and comprehensive review of the soft tissue, arterial system, and innervation of the patellofemoral joint in order to supply the clinician with the knowledge required to assess the anatomy and make recommendations to patients identified as potentially at risk. The purpose of this article is to review knee anatomy and the literature regarding potential risk factors associated with patellofemoral pain syndrome and prehabilitation strategies. A comprehensive review of knee anatomy will present the relationships of arterial collateralization, innervation, and soft tissue alignment to the possible multifactorial mechanism involved in PFPS, while advocating future use of different treatments aimed at non-soft tissue causes of PFPS.

    Sample treatment for tissue proteomics in cancer, toxicology, and forensics

    Since the birth of proteomics science in the 1990s, the number of applications and of sample preparation methods has grown exponentially, making a huge contribution to the knowledge in life science disciplines. Continuous improvements in the sample treatment strategies unlock and reveal the fine details of disease mechanisms, drug potency, and toxicity as well as enable new disciplines to be investigated such as forensic science. This chapter will cover the most recent developments in sample preparation strategies for tissue proteomics in three areas, namely, cancer, toxicology, and forensics, thus also demonstrating breadth of application within the domain of health and well-being, pharmaceuticals, and secure societies. In particular, in the area of cancer (human tumor biomarkers), the most efficient and multi-informative proteomic strategies will be covered in relation to the subsequent application of matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI) and liquid extraction surface analysis (LESA), due to their ability to provide molecular localization of tumor biomarkers albeit with different spatial resolution. With respect to toxicology, methodologies applied in toxicoproteomics will be illustrated with examples from its use in two important areas: the study of drug-induced liver injury (DILI) and studies of effects of chemical and environmental insults on skin, i.e., the effects of irritants, sensitizers, and ionizing radiation. Within this chapter, mainly tissue proteomics sample preparation methods for LC-MS/MS analysis will be discussed as (i) the use of LC-MS/MS is heavily represented in the research efforts of the bioanalytical community in this area and (ii) LC-MS/MS is still the gold standard for quantification studies.
Finally, the use of proteomics will also be discussed in forensic science with respect to the information that can be recovered from blood and fingerprint evidence, which are commonly encountered at the scene of the crime. The application of proteomic strategies for the analysis of blood and fingerprints is novel, and proteomic preparation methods will be reported in relation to the subsequent use of mass spectrometry without any hyphenation. While generally yielding more information, hyphenated methods are often more laborious and time-consuming; since forensic investigations need quick turnaround without compromising the validity of the information, the prospect of developing methods for the application of quick forensic mass spectrometry techniques such as MALDI-MS (in imaging or profiling mode) is of great interest.

    Latent topic ensemble learning for hospital readmission cost reduction

    © 2017 IEEE. Unplanned hospital readmission is a costly problem in the United States. Patients treated and readmitted within 30 days cost taxpayers up to $26 billion annually. In 2013 the U.S. federal government began to reduce payments to hospitals with excessive patient readmissions. Predictive modeling using machine learning can be a useful tool to help identify patients most likely to need readmission. However, current systems have several shortcomings. When creating predictive models for hospital readmission, existing methods either build models using data from a single hospital or naively combine data from multiple hospitals. Because hospitals often have different data distributions, models created from a single hospital's data are often biased. Additionally, models created from combined data overlook local data distributions. In this paper, we propose LTEL, which uses an ensemble of topic-specific models to leverage data from multiple hospitals. LTEL creates models based on latent topics derived from different hospitals. Models are built and evaluated incorporating federal financial penalties. The dataset contains data collected from 16 regional hospitals. LTEL significantly outperforms the best-performing baseline method for cost reduction.

    Co-occurring evidence discovery for COPD patients using natural language processing

    © 2017 IEEE. Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung disease that affects airflow to the lungs. Discovering the co-occurrence of COPD with other diseases and symptoms is invaluable to medical staff. Building co-occurrence indexes and finding causal relationships with COPD can be difficult because disease prevalence within a population often influences results. A method that can better separate occurrence within COPD patients from population prevalence would be desirable. Natural Language Processing (NLP) methods are used to examine 64,371 deidentified clinical notes and discover associations between COPD and medical terms. A co-occurrence score is presented which can penalize scores based on term prevalence. The maximum improvements in recall for symptoms and diseases were 0.212 and 0.130. The maximum improvements in precision for symptoms and diseases were 0.303 and 0.333.